Performance Evaluation Metrics for

Information  Systems  Development: 

A  Principal-Agent  Model 

Rajiv  D.  Banker  and  Chris  F.  Kemerer 

Abstract 

The  information  systems  (IS)  development  activity  in  large  organizations  is  a  source  of  increasing 
cost and concern to management. IS development projects are often over budget, late, costly to
maintain,  and  not  done  to  the  satisfaction  of  the  requesting  client.  These  problems  exist,  in  part, 
due  to  the  organization  of  the  IS  development  process,  where  information  systems  development  is 
typically assigned by the client (principal) to a developer (agent). The inability to directly monitor
the  agent  requires  the  use  of  multiple  performance  measures,  or  metrics,  to  represent  the  agent's 
actions  to  the  principal. 

This  paper  develops  a  principal-agent  model  (based  on  information  economics)  that  is  solved  to 
provide  the  set  of  decision  criteria  for  the  principal  to  use  to  appropriately  weight  each  of  the 
multiple  metrics  in  order  to  provide  an  incentive  compatible  contract  for  the  agent.  These  criteria 
include  the  sensitivity  and  the  precision  of  the  performance  metric.  After  presenting  the  formal 
model,  some  current  software  development  metrics  are  discussed  to  illustrate  how  the  model  can  be 
used  to  provide  a  theoretical  foundation  and  a  formal  vocabulary  for  performance  metric 
evaluation.  The  model  can  also  be  used  in  a  positive  manner  to  suggest  explanations  for  the 
current  relative  emphasis  in  practice  on  particular  metrics. 
I. INTRODUCTION

The  information  systems  (IS)  development  activity  in  large  organizations  is  a  source  of  increasing 
cost and concern to management. IS development projects^1 are often over budget, late, costly to
maintain, and not done to the satisfaction of the requesting client. It has been suggested that these
problems  exist,  in  part,  due  to  the  organization  of  the  IS  development  process,  where  information 
systems  development  is  typically  assigned  by  the  client  (principal)  to  a  developer  (agent) 
[Gurbaxani  and  Kemerer  1989, 1990]  [Beath  and  Straub  1989]  [Klepper  1990]  [Whang  1990]. 
This  agency  relationship  imposes  the  standard  agency  costs  due  to  the  goal  incongruence  of  the 
principal  and  the  agent  and  the  imperfect  monitoring  of  the  agent's  actions  by  the  principal.  The 
inability to directly monitor the agent requires the use of indirect performance measures, or metrics,
to represent the agent's actions to the principal. To appropriately represent information systems
development,  it  will  be  necessary  for  the  principal  to  employ  multiple  metrics  to  account  for  the 
multi-dimensional  nature  of  the  information  systems  development  product  [Bakos  and  Kemerer 
1991]  [Mendelson  1991].  The  principal  is  then  confronted  with  the  problem  of  assigning 
individual  weights  to  these  multiple  metrics  in  order  to  provide  incentives  for  the  agent. 

The  difficulties  that  principals  have  in  assigning  weights  can  be  easily  illustrated  with  a  simple 
example.  It  is  well-documented  that  the  maintenance  costs  associated  with  information  systems 
typically  exceed  the  development  cost  over  the  system's  useful  life  [Swanson  and  Beath  1990]. 
Yet,  in  practice,  software  developers  are  typically  evaluated  by  criteria  such  as  on-time  and  on- 
budget  delivery  of  the  initial  system,  and  rarely,  if  ever,  on  the  likely  maintainability  of  the  system 
that  they  have  just  delivered  [Gode  et  al.  1990].  Izzo  notes  that,  "Maintenance,  long  considered 
one  of  the  most  important  product  support  services  a  business  provides,  is  considered  a  secondary 
responsibility  in  information  systems"  [1987,  p.  25].  It  is  apparent  paradoxes  such  as  this  one  that 
suggest  that  principals  have  difficulty  with  the  multi-dimensional  nature  of  IS  development 
performance  evaluation. 

This  paper  develops  a  principal-agent  model  (based  on  information  economics)  that  is  analyzed  to 
identify  the  set  of  decision  criteria  for  the  principal  to  use  to  develop  the  contract.  These  criteria  are 
the sensitivity and precision of the performance metric. This formal model is developed and
shown in Section II. The model provides a theoretical foundation and a formal vocabulary for
performance metric evaluation in the general context of a multidimensional performance contract.
Section III first develops a simple framework of IS development project performance metrics.
Then, two actual IS development organizations are described in terms of the performance metrics
they have in current use. One of the IS development groups is internal to an organization and one
is an external IS development services provider. The results of the model are applied to these
organizations to illustrate the model's use in a positive manner to suggest explanations for the
current relative emphasis in practice on particular metrics. Section IV presents a broader
discussion of both the ramifications and limitations of the model outside the context of the two
organizations studied. Finally, some concluding remarks are presented in Section V.

^1 By "development" what is meant are all the activities that constitute the systems life cycle, including systems
maintenance. Activities solely related to new systems exclusive of any maintenance activity will be referred to as
"new development".

II.  MODEL 

Information  systems  (IS)  development  is  modeled  as  a  principal-agent  problem,  with  the  client 
(the principal) desiring information systems to be developed to meet her goals^2. She contracts with
an IS development manager (the agent) to perform this work, due to specialized expertise on the
part of the agent. The normal principal-agent model assumptions are made, i.e., that the goals of
the agent are assumed to be only imperfectly aligned with those of the principal (goal
incongruence) and that the agent's actions can only be imperfectly observed by the principal
(information  asymmetries).    Considerable  prior  work  exists  in  this  area,  and  the  interested  reader 
is  referred  to  [Ross  1973]  [Jensen  and  Meckling  1976]  [Holmstrom  1979]  [Harris  and  Raviv 
1979]  and  [Holmstrom  and  Milgrom  1990]. 

The assumption is made that the principal is interested in the outcome along n dimensions, which
are represented by the vector $x = (x_1, \ldots, x_i, \ldots, x_n)$. The agent can increase the likelihood of
obtaining a better outcome $x_i$ by devoting more effort, $a_i$, towards that outcome. More formally,

$\partial m_i / \partial a_i > 0, \quad \partial m_i / \partial a_j = 0, \quad i, j = 1, 2, \ldots, n, \; j \neq i$

where $m_i = E(x_i \mid a_i)$ = expected value of outcome $x_i$.

The outcomes cannot be observed jointly by the principal and the agent with perfect accuracy.
The agent's efforts $a = (a_1, \ldots, a_i, \ldots, a_n)$ cannot be perfectly observed by the principal without
incurring prohibitive monitoring costs. For performance evaluation purposes, therefore,
appropriate metrics, $y = (y_1, \ldots, y_i, \ldots, y_n)$, are developed to provide (imperfect) signals about the true
outcomes.

^2 Following [Beath and Straub 1989], the use of "she/her" will refer to the principal, and "he/him" will refer to the
agent in order to make pronoun references easier to follow. The model will focus on only these two parties, and
excludes from consideration any possible agency relationship between the principal requesting the work and her
superior, for instance, as suggested by [Gurbaxani and Kemerer 1990]. Therefore, it is applicable to situations
involving either external or internal developers.

More  formally, 

$y_i = x_i + \varepsilon_i, \quad i = 1, 2, \ldots, n$

where the $\varepsilon_i$ represent random variations (noise) for each of the n outcomes of interest.

In  order  to  provide  incentives  for  the  agent  to  exert  greater  effort  to  produce  higher  levels  of  the 
outcomes  of  interest  to  the  principal,  the  principal  bases  the  agent's  compensation  on  the  jointly 
observable  metrics: 

$s = s(y)$

where  s  represents  the  agent's  compensation.  The  monetary  value  of  the  outcomes  to  the  principal 
is  represented  by  w,  where  w  is  a  function  of  x,  and  therefore  the  risk-neutral  principal  seeks  to 
maximize  the  expected  value  of  w(x)  -  s(y).  The  agent  is  assumed  to  be  risk  and  effort  averse,  and 
he  must  be  compensated  at  the  end  of  the  contractual  time  period  [Lambert  1983].  The  principal 
understands  the  agent  to  be  economically  rational,  and  knows  that  a  compensation  contract  based 
on  y  will  influence  the  agent's  actions  a.  The  agent  seeks  to  maximize  the  expected  value  of: 

u(s) -  v(a) 

where $u(\cdot)$ represents his or her utility for compensation, $s(\cdot)$, and $v(\cdot)$ represents his or her
disutility for effort, with $u'(\cdot) > 0$, $u''(\cdot) < 0$, and $v'(\cdot) > 0$. The principal's problem can now be
formulated as follows:

(1)   $\max_{s(\cdot),\, a} \; E[w(x) - s(y)]$

subject to:

(2a)  $E[u(s(y)) - v(a)] \geq U_0$

(2b)  $\partial E[u(s(y)) - v(a)] / \partial a_i = 0$ for $i = 1, \ldots, n$

(2c)  $s \in [s_L, s_H], \quad a \in [a_L, a_H]$

The  objective  function  simply  maximizes  the  expected  benefit,  w(x),  to  the  principal  of  the 
information  systems  outcomes,  x,  net  of  compensation,  s,  paid  to  the  agent.  The  first  constraint 
("individual  rationality")  ensures  that  the  contract  guarantees  the  agent  a  minimum  expected  utility 
level, $U_0$, equalling at least his best alternative employment possibility. The next set of n
constraints ("incentive compatibility") ensures that the agent's effort level choices, $a_i$, $i = 1, 2, \ldots, n$,
maximize his own expected utility level, and thus provide incentive compatibility. This set of
first-order optimization conditions is assumed to characterize the optimal action choice for the agent
[Rogerson  1985].  The  final  constraints  specify  a  bounded  feasible  space  to  ensure  the  existence  of 
an  optimal  solution  to  the  principal's  constrained  maximization  problem  [Holmstrom  1979]. 
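The incentive compatibility constraint (2b) can be illustrated with a small numerical sketch. The functional forms below are assumptions made only for this illustration and are not part of the model as stated: a single effort dimension, a linear contract s = alpha + beta*y, a metric y that is normal with mean equal to effort, an exponential utility u(s) = -exp(-r*s), and a quadratic disutility v(a) = cost*a^2/2. All function and parameter names are hypothetical.

```python
import math

def agent_expected_utility(a, alpha, beta, r, noise_var, cost):
    """E[u(s(y))] - v(a) under the assumed forms:
    s = alpha + beta*y, y ~ Normal(a, noise_var),
    u(s) = -exp(-r*s), v(a) = cost * a**2 / 2."""
    # For normal s, E[exp(-r*s)] = exp(-r*E[s] + r^2 * Var[s] / 2).
    exponent = -r * (alpha + beta * a) + 0.5 * (r * beta) ** 2 * noise_var
    return -math.exp(exponent) - 0.5 * cost * a ** 2

def marginal_utility(a, alpha, beta, r, noise_var, cost):
    """Derivative of agent_expected_utility with respect to effort a."""
    exponent = -r * (alpha + beta * a) + 0.5 * (r * beta) ** 2 * noise_var
    return r * beta * math.exp(exponent) - cost * a

def best_effort(alpha, beta, r, noise_var, cost, a_low=0.0, a_high=10.0):
    """Solve the first-order condition (marginal utility = 0) by bisection.
    The objective is concave in a, so the marginal utility is decreasing
    and has at most one root on [a_low, a_high]."""
    lo, hi = a_low, a_high
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if marginal_utility(mid, alpha, beta, r, noise_var, cost) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    effort = best_effort(alpha=1.0, beta=0.5, r=1.0, noise_var=1.0, cost=1.0)
    print("effort satisfying the first-order condition:", effort)
```

In this sketch the bounded interval plays the role of constraint (2c), and the agent's effort is the point at which the derivative in (2b) vanishes.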

The  Euler-Lagrange  optimization  conditions  for  the  mathematical  program  above  are  given  by  the 
following: 


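(3)   $\dfrac{1}{u'(s(y))} = \lambda + \sum_{i=1}^{n} \mu_i \, \dfrac{\int [\partial f(x,y;a)/\partial a_i]\,dx}{\int f(x,y;a)\,dx}$

(4)   $\dfrac{\partial}{\partial a_i} E[w(x) - s(y)] + \sum_{j=1}^{n} \mu_j \, \dfrac{\partial^2}{\partial a_i \, \partial a_j} E[u(s) - v(a)] = 0$ for each $i = 1, \ldots, n$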

Here, $\lambda$ and $\mu_i$, $i = 1, \ldots, n$, are Lagrange multipliers for the (n+1) constraints. The joint probability
density function of the outcomes x and the metrics y is embodied in $f(\cdot)$, and $\partial f(\cdot)/\partial a_i$ denotes its
partial derivative with respect to effort dimension $a_i$. The condition in (3) reflects pointwise
optimization for each observable value of the metric vector y. Since the actual outcomes x are not
jointly observable, the incentive contract cannot be based on them, and therefore integration is
performed over all possible values of x in condition (3). Let

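(5)   $f(x,y;a) = g(x \mid y;a)\, h(y;a)$

where $g(\cdot)$ is the probability density function of x conditional on the observed value of y, and $h(\cdot)$
is the marginal probability density function of y. Now,

(6)   $\int [\partial f(\cdot)/\partial a_i]\,dx = \int [\partial g(\cdot)/\partial a_i]\, h(\cdot)\,dx + \int g(\cdot)\,[\partial h(\cdot)/\partial a_i]\,dx$

But, $\int g(\cdot)\,dx = 1$ because $g(\cdot)$ is a probability density function, and therefore,

(7)   $\int [\partial g(\cdot)/\partial a_i]\, h(\cdot)\,dx = h(\cdot)\, \dfrac{\partial}{\partial a_i} \int g(\cdot)\,dx = 0$

It follows from (5), (6) and (7) that

(8)   $\dfrac{\int [\partial f(x,y;a)/\partial a_i]\,dx}{\int f(x,y;a)\,dx} = \dfrac{\partial h(y;a)/\partial a_i}{h(y;a)}$

Returning to the condition in (3):

(9)   $\dfrac{1}{u'(s)} = \lambda + \sum_{i=1}^{n} \mu_i \, \dfrac{\partial h(y;a)/\partial a_i}{h(y;a)}$

Differentiating (9) with respect to a particular $y_j$, $j = 1, \ldots, n$, yields

(10)  $\left[\dfrac{-u''(s)}{(u'(s))^2}\right] \left[\dfrac{\partial s(y)}{\partial y_j}\right] = \sum_{i=1}^{n} \mu_i \, \dfrac{\partial}{\partial y_j} \left[\dfrac{\partial h(\cdot)/\partial a_i}{h(\cdot)}\right]$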

In order to derive the distribution of the performance metrics y, some additional structure is
imposed. In particular, it is assumed that the stochastic variables $x_i$, given the agent's choice of
efforts $a_i$, are statistically independent^3 and are normally distributed with means $m_i$ and variances
$\eta_i^2$. The measurement error $\varepsilon_i$ in the metric $y_i$ is also assumed to be distributed normally with mean
zero and variance $\sigma_i^2$. The errors $\varepsilon_i$ are assumed to be distributed independently of $x_i$, $x_j$ and $\varepsilon_j$, $j \neq i$.
It follows, therefore, that the metrics $y_i = x_i + \varepsilon_i$ are distributed independently of the other stochastic
variables described above.

The distribution of each $y_i$, being a convolution of two random variables following a bivariate
normal distribution, is itself normal with mean $E(y_i) = E(x_i) + E(\varepsilon_i) = m_i + 0 = m_i$ and variance
$V(y_i) = V(x_i) + V(\varepsilon_i) = \eta_i^2 + \sigma_i^2$.^4 The probability density function $h_i(y_i \mid a_i)$ is then given by:

$\ln h_i(y_i \mid a_i) = -\tfrac{1}{2} \ln [2\pi V(y_i)] - [y_i - m_i(a_i)]^2 / [2 V(y_i)]$
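The mean and variance of a metric stated above can be checked by simulation. The following is a minimal sketch, assuming illustrative parameter values that do not come from the paper; the function names are hypothetical. It draws an outcome with a given mean and variance, adds independent zero-mean measurement noise, and reports the sample moments of the resulting metric.

```python
import math
import random
import statistics

def simulate_metric(mean, outcome_var, noise_var, n_draws, seed=0):
    """Draw n_draws values of y = x + e, where x ~ Normal(mean, outcome_var)
    and e ~ Normal(0, noise_var) are independent."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        x = rng.gauss(mean, math.sqrt(outcome_var))
        e = rng.gauss(0.0, math.sqrt(noise_var))
        draws.append(x + e)
    return draws

def sample_moments(draws):
    """Return the sample mean and (population) variance of the draws."""
    return statistics.fmean(draws), statistics.pvariance(draws)

if __name__ == "__main__":
    # Illustrative values only: mean 2.0, outcome variance 1.0, noise variance 0.5.
    y = simulate_metric(mean=2.0, outcome_var=1.0, noise_var=0.5, n_draws=20000)
    m, v = sample_moments(y)
    print("sample mean:", m, "sample variance:", v)
```

With these assumed inputs the sample mean should lie close to the outcome mean and the sample variance close to the sum of the two variances.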

Further, since the $y_i$ are independently distributed,

$h(y \mid a) = \prod_{i=1}^{n} h_i(y_i \mid a_i)$

and 


^3 The agent will trade off allocations of efforts $a_i$ to different activities i, and to that extent the model captures the
interdependent nature of the outcomes. The statistical independence assumption is maintained for expositional
convenience; the principal results extend to the case of correlated stochastic variables.

^4 In the analysis presented here, it is assumed that only the mean $m_i(a_i)$ is affected by the agent's actions. However,
this approach could be extended to address the case where the variance of $x_i$ can be influenced by the agent's actions.


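therefore,

$\dfrac{\partial h(y \mid a)/\partial a_i}{h(y \mid a)} = \dfrac{\partial \ln h(y \mid a)}{\partial a_i} = \dfrac{\partial \ln h_i(y_i \mid a_i)}{\partial a_i}$

$= [y_i - m_i(a_i)]\,[\partial m_i(a_i)/\partial a_i] / V(y_i)$

$= [y_i - m_i(a_i)]\,[\partial m_i(a_i)/\partial a_i] / [\eta_i^2 + \sigma_i^2]$

and

$\dfrac{\partial}{\partial y_j} \left[\dfrac{\partial h(y \mid a)/\partial a_i}{h(y \mid a)}\right] = 0$ for $j \neq i$

$= [\partial m_i(a_i)/\partial a_i] / [\eta_i^2 + \sigma_i^2]$ for $j = i$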
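The two derivatives above can be verified numerically for a single metric. This is a sketch under an assumed mean function, m(a) = sqrt(a), chosen only to have diminishing returns to effort; it is not taken from the paper, and all names are hypothetical. Central finite differences of the normal log density are compared with the closed-form expressions.

```python
import math

def log_h(y, effort, mean_fn, var_y):
    """Log of the normal density of metric y with mean mean_fn(effort)
    and variance var_y (the sum of outcome and measurement variances)."""
    return (-0.5 * math.log(2 * math.pi * var_y)
            - (y - mean_fn(effort)) ** 2 / (2 * var_y))

def score(y, effort, mean_fn, mean_slope_fn, var_y):
    """Closed form for d ln h / d effort: (y - m(a)) * m'(a) / V(y)."""
    return (y - mean_fn(effort)) * mean_slope_fn(effort) / var_y

def central_difference(fn, point, step=1e-5):
    """Two-sided finite-difference approximation to fn'(point)."""
    return (fn(point + step) - fn(point - step)) / (2 * step)

# Assumed (hypothetical) mean function and its derivative.
def mean_fn(effort):
    return math.sqrt(effort)

def mean_slope_fn(effort):
    return 0.5 / math.sqrt(effort)

if __name__ == "__main__":
    y, a, var_y = 1.7, 4.0, 1.5
    numeric_score = central_difference(lambda e: log_h(y, e, mean_fn, var_y), a)
    print("numeric vs closed-form score:",
          numeric_score, score(y, a, mean_fn, mean_slope_fn, var_y))
    # Derivative of the score with respect to y should equal m'(a) / V(y).
    numeric_cross = central_difference(
        lambda yy: score(yy, a, mean_fn, mean_slope_fn, var_y), y)
    print("numeric vs closed-form cross derivative:",
          numeric_cross, mean_slope_fn(a) / var_y)
```

The cross derivative does not depend on the realized value of y, which is the property the subsequent step of the derivation relies on.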

It  follows  from  equation  (10)  then  that 

(11)  $\dfrac{\partial s^*(y)}{\partial y_i} = \dfrac{-(u'(\cdot))^2}{u''(\cdot)} \cdot \dfrac{\mu_i \, [\partial m_i(a_i)/\partial a_i]}{[\eta_i^2 + \sigma_i^2]}$

Since the right hand side of the above equation (11) is independent of y, it follows that $s^*(y)$ can
be written as $s^*(y) = s_1^*(s_2^*(y))$ where $s_2^*(y)$ is linear in y and can be interpreted as the
aggregated performance evaluation metric. It also follows from equation (11) that

(12)  $s_2^*(y) = \sum_{i=1}^{n} \rho_i \, \xi_i \, y_i$

where $\rho_i = [\eta_i^2 + \sigma_i^2]^{-1}$ is the precision of the metric $y_i$, which is inversely related to $V(x_i)$ and
$V(\varepsilon_i)$. Precision is a measure of the degree to which the agent can predict the value of the metric,
given a set of actions. The lack of precision, or increase in the variance, can be seen as being due
to two sources. The first is that the relationship between some outcome $x_i$ and corresponding
action $a_i$ may be inherently non-controllable by the agent, due to the effect of factors outside his
purview. A second source may be a lack of accuracy, or "noise" in measuring $x_i$, i.e., large
variations in the values of $\varepsilon_i$. More formally, the inverse of the precision measure can be
decomposed into its two constituent components, as follows:

$\mathrm{var}(y_i \mid a) = \mathrm{var}(x_i \mid a) + \mathrm{var}(\varepsilon_i)$

where the first term on the RHS corresponds to the controllability component and the second term
corresponds to the accuracy component^5. A metric with higher precision will receive a higher
weight  in  the  aggregate  since  the  metric  will  be  more  informative  about  the  agent's  action  choice. 
This  is  true  whether  the  greater  precision  results  from  greater  controllability,  greater  accuracy,  or 
some  combination. 
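The precision of a metric and its two variance components can be computed directly. The sketch below uses hypothetical function names and illustrative inputs; it takes the variance of the outcome given effort (the controllability component) and the variance of the measurement error (the accuracy component) and returns the reciprocal of their sum.

```python
def precision(controllability_var, measurement_var):
    """Precision of a metric: the reciprocal of the sum of the outcome
    variance (controllability) and the measurement-error variance (accuracy)."""
    if controllability_var < 0 or measurement_var < 0:
        raise ValueError("variances must be non-negative")
    total_var = controllability_var + measurement_var
    if total_var == 0:
        raise ValueError("total variance must be positive")
    return 1.0 / total_var

def variance_shares(controllability_var, measurement_var):
    """Fractions of the metric's total variance due to each component."""
    total_var = controllability_var + measurement_var
    if total_var <= 0:
        raise ValueError("total variance must be positive")
    return controllability_var / total_var, measurement_var / total_var

if __name__ == "__main__":
    # Illustrative values only.
    print(precision(1.0, 3.0))        # reciprocal of the summed variances
    print(1.0 / 1.0 + 1.0 / 3.0)      # sum of reciprocals, a different quantity
    print(variance_shares(1.0, 3.0))
```

Reducing either component raises precision, which is why greater controllability and greater accuracy both lead to a higher weight in the aggregate.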

In equation (12), $\xi_i = \mu_i \, \partial m_i(a_i)/\partial a_i$ is the sensitivity of the outcome $x_i$ (and the metric $y_i$) to the
agent's action $a_i$. Using standard sensitivity analysis in optimization theory [Ioffe and Tihomirov
1979, pp. 292-298] the $\xi_i$ is seen to correspond to the change in the principal's expected utility
relative to the change in the agent's expected utility when, at the optimal solution, the agent's
incentive compatibility constraint for the choice of $a_i$ is perturbed marginally. In other words, $\xi_i$ is
the marginal value to the principal of providing the incentive to the agent to increase his effort $a_i$ by
a marginal unit. The principal will want to encourage agents' actions that most increase the final
payoff to the principal, and therefore metrics that correspond to these "high payoff" activities that
are most sensitive to the agent's actions will be more highly weighted relative to those with less
impact^6.

For  a  metric  to  exhibit  high  sensitivity  it  must  exhibit  significant  changes  during  the  evaluation 
period  in  response  to  the  agent's  actions.  A  very  sensitive  metric  would  show  a  large  change  in  the 
value  to  the  principal,  on  average,  for  even  a  small  additional  amount  of  disutility  to  the  agent 
resulting from an increase in effort. In terms of the optimal contract, more weight will be placed
on metrics with high sensitivity relative to those with low sensitivity [Banker and Datar 1989]^7.
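The aggregated performance evaluation metric of equation (12) weights each metric by the product of its precision and its sensitivity. The sketch below is illustrative only: the function names are hypothetical and the numbers are made up, not estimates from any organization discussed in the paper.

```python
def metric_weights(precisions, sensitivities):
    """Weight on each metric: precision times sensitivity."""
    if len(precisions) != len(sensitivities):
        raise ValueError("precisions and sensitivities must have equal length")
    return [p * s for p, s in zip(precisions, sensitivities)]

def aggregate_metric(metric_values, precisions, sensitivities):
    """Linear aggregate of the observed metrics, as in equation (12)."""
    weights = metric_weights(precisions, sensitivities)
    if len(metric_values) != len(weights):
        raise ValueError("metric_values must match the number of weights")
    return sum(w * y for w, y in zip(weights, metric_values))

def relative_weights(precisions, sensitivities):
    """Weights rescaled to sum to one, for comparing emphasis across metrics."""
    weights = metric_weights(precisions, sensitivities)
    total = sum(weights)
    if total == 0:
        raise ValueError("weights sum to zero")
    return [w / total for w in weights]

if __name__ == "__main__":
    # Two hypothetical metrics: the first is precise but less sensitive,
    # the second is noisy but more sensitive.
    precisions = [0.5, 0.25]
    sensitivities = [2.0, 4.0]
    print(metric_weights(precisions, sensitivities))
    print(aggregate_metric([3.0, 5.0], precisions, sensitivities))
    print(relative_weights([1.0, 1.0], [3.0, 1.0]))
```

A metric can therefore earn a large weight either by being precise or by being sensitive, and a metric that is poor on both counts receives little emphasis.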

Since  the  true  levels  of  the  agent's  efforts,  a,  are  unobserved,  for  incentive  contracting  purposes 
the  principal  and  the  agent  agree  on  a  set  of  performance  evaluation  metrics,  y,  that  can  be 
observed.  This  multi-dimensionality  poses  a  dilemma  for  the  principal:  how  to  establish  a  contract 
that  maximizes  the  agent's  efforts  appropriately  across  dimensions;  in  particular,  which  metrics  to 
emphasize  most  in  the  agent's  performance  evaluation. 


^5 Precision is the reciprocal of the sum of the variances, which is generally not equal to the sum of the reciprocals of
the variances.

^6 With multi-dimensional tasks the agent has tradeoff possibilities and it is therefore possible that a particular $\xi_i$
could be very small. Therefore, it would be optimal at the margin to not devote additional effort to that task. In the
event that the precision and sensitivity of the associated metric are low, effort devoted to that dimension is likely to
become extremely small. This result is complementary to that of Holmstrom and Milgrom [1990].

^7 This extends the work of Banker and Datar (1989) to the case of imperfect performance metrics for multiple
outcomes of interest.


In  order  to  effect  the  appropriate  behaviors,  the  principal  will  base  the  agent's  compensation  in  part 
upon the value of the performance metrics y.^8 Since the y are likely to be imperfect surrogates for
the x and underlying effort choices a, some uncertainty is present. Therefore, an extreme form of
compensation  contract  involving  total  reliance  on  performance  evaluation  metrics  is  unlikely,  since 
this  places  extreme  risk  on  the  agent,  who  is  assumed  to  be  risk  averse.  Conversely,  however,  the 
opposite  extreme  of  zero  reliance  on  the  performance  evaluation  metrics  is  also  unlikely,  as  this 
does  not  allow  the  principal  to  offer  any  incentives  for  appropriate  behavior. 

However,  within  the  range  of  likely  contract  forms,  there  is  still  room  for  considerable  variation  in 
terms of the choice of individual metrics (the $y_i$'s) and the weight that is to be assigned each metric
in the compensation scheme. A final potential issue is the mapping of the weighted performance
evaluation metrics to the actual rewards. However, as shown above in equation (12), this third
step is straightforward in this analysis, as the rewards will depend directly upon the weighted
aggregate of the individual $y_i$'s. Therefore, the critical decision problem for the principal is
assigning  the  relative  weights. 

III.  APPLICATION  OF  THE  MODEL  TO  MIS  DEVELOPMENT 

In this three part section the model developed in Section II is applied to the domain of Management
Information  Systems  (MIS)  development.  Section  A  describes  the  overall  dimensions  of 
performance  evaluation  in  MIS  development,  while  Section  B  presents  metric  operationalizations 
of  these  dimensions  gleaned  from  two  mini-case  studies.  Section  C  then  interprets  the  case  study 
data  in  light  of  the  model  results. 

A.    Performance  Evaluation  in  IS  Development 

The  principal  will  seek  to  motivate  the  agent  to  take  actions  that  will  increase  gross  benefits  and  to 
decrease costs.^9 It is assumed that higher effort on the part of the agent increases the expected
value of the gross benefits to the principal. Going beyond a general cost-benefit framework, in an
IS development context the costs and benefits have both long term and short term components. On
the cost side the short term costs are primarily those associated with systems development, most
prominently the labor costs. However, there is also a longer term maintenance cost associated with
each system. Numerous studies have shown that over half of all systems moneys are spent on
maintenance [Elshoff 1976] [Lientz and Swanson 1980] [Boehm 1987] and, most recently, that for
every dollar spent on development, nine will be spent on maintenance [Corbi 1989]. While many
factors (including exogenous factors such as future changes in the business environment) may
affect maintenance costs, for information systems development contracting purposes the principal
can only attempt to ensure that the system developed by the agent can be maintained at the least
possible foreseeable cost.

^8 The emphasis here is on the use of a set of metrics to evaluate performance. The form of the actual reward, be it
cash, stock options, promotion, time off, etc., will clearly vary due to individual preference, prevailing industry
norms, etc., and will not be considered here.

^9 Cooper and Mukhopadhyay, in a recent review and analysis of potential IS effectiveness evaluation approaches, note
that only three approaches, cost/benefit analysis, information economics, and microeconomic production functions,
are suitable for use in performance evaluation, and that of these, only the first is of current practical applicability
[Cooper and Mukhopadhyay 1990, p. 5 and Figure 1]. Therefore, for illustrating the model in terms of current
practice, the focus is on the cost/benefit approach to performance evaluation.

Similarly, the benefits side of the equation has both a short and long term component. The
principal requesting the system clearly can begin to benefit only when the system is completed.
Further, the business use of the new system has to be coordinated with several other business
activities, and considerable other resources have to be committed at the anticipated implementation
time for the system. Therefore, if the system is delivered on time, the principal is likely to be better
off, ceteris paribus, than if it were delivered late. This corresponds to the notion of timeliness, the
ability to deliver the system on or before the deadline. However, in the long term, the ultimate
value of the system may be due to the provision of user-desirable functionality. This is the notion
of effectiveness, which can only be interpreted in a longer term context.

Therefore, for model illustration purposes, the focus is on four outcomes toward which the principal
directs the efforts of the agent, represented as $x_1$ (negative of development cost), $x_2$ (maintainability), $x_3$
(timeliness), and $x_4$ (effectiveness)^10. These are perhaps best presented by means of a 2x2 matrix:


             SHORT TERM                  LONG TERM

COST         Initial Development Cost    Maintainability

BENEFIT      Timeliness                  Effectiveness

Table 1: Classification Matrix of IS Development Project Outcomes


^10 Note that the research problem of interest here is the measurement of project results, which are the principal's
typical concern, especially in the case of an external agent. Organizational level effects (e.g., the degree to
which a project furthered the professional development of the project staff (in order to possibly increase their value
on future projects) or the degree to which a project followed site technical standards (in hopes of increasing future
inter-operability and hence productivity)) are only secondary effects in terms of an individual project and therefore are
not considered here.
Table 1 presents four outcomes (x) of interest to the principal requesting the information system.
The principal and the agent must jointly agree on a set of performance evaluation metrics, y, for the
compensation contract. If the x are observable by both the principal and the agent in the
contractual period, then these may serve as the y. However, if that is not the case, then the
principal and the agent must determine surrogate metrics that are jointly observable.

B.     Performance  Evaluation  Metric  Operationalizations 

Two  sites  were  selected  as  mini-case  studies  to  determine  the  type  and  extent  of  measurement  used, 
one an internal development organization and one an external firm. While the two sites represent a
convenience sample, they are believed to be representative of typical current practice in information
systems development^11.

The  internal  organization  is  located  within  a  large  commercial  bank.  The  information  systems 
development  group  consists  of  approximately  450  professional  staff  members  who  work  at 
developing  and  maintaining  financial  application  software  for  the  bank's  internal  use.  The 
applications  are  largely  on-line  transaction  processing  systems,  operating  almost  exclusively  in  an 
IBM  mainframe  COBOL  environment.  The  bank's  systems  contain  over  10,000  programs, 
totalling  over  20  million  lines  of  code.  The  programs  are  organized  into  application  systems  (e.g., 
Demand  Deposits)  of  typically  100  -  300  programs  each.  Some  of  the  bank's  major  application 
systems  were  written  in  the  mid-1970's  and  are  generally  acknowledged  to  be  more  poorly 
designed  and  harder  to  maintain  than  more  recently  written  software.  The  bank  has  made  some 
attempts  to  upgrade  its  systems  development  capability.  These  steps  include  the  introduction  of  a 
commercial  structured  analysis  and  design  methodology,  the  institution  of  a  formal  software  reuse 
library,  and  the  piloting  of  some  CASE  tools. 

The  external  organization  is  a  major  systems  consulting  and  integration  firm  that  operates 
nationally.  Their  staff  consists  of  over  2000  systems  development  professionals  who  are  recruited 
from  the  leading  colleges  and  universities.  They  develop  custom  applications  and  sell  customizable 
packages  to  a  variety  of  public  and  private  clients.  Their  various  divisions  are  organized  around  a 
small  number  of  specific  industries,  such  as  financial  services.  These  divisions  tend  to  focus  on 
software  and  hardware  platforms  that  are  widespread  in  their  respective  market  segments,  although 
there  is  some  firmwide  commonality  across  divisions  via  a  standardized  development  methodology 
and toolset. An emphasis is placed on very large systems integration projects that are often
multi-year engagements. A state of the art development environment is maintained, with the firm being
an early adopter of most software engineering innovations.

^11 See Section IV.A for some external validation of this assumption.

At  the  bank,  development  cost  is  tracked  through  a  project  accounting  system  that  is  used  to 
charge back systems developer hours to the requesting user department. Hours are charged on a
departmental  average  basis,  with  no  allowance  for  the  skill  or  experience  level  of  the  developer 
being  incorporated  into  the  accounting  system.  Mainframe  computer  usage  is  also  charged  back  to 
the  user,  at  a  'price'  designed  to  fully  allocate  the  annual  cost  of  operating  the  data  center  to  the 
users.  However,  labor  costs  are  generally  believed  to  constitute  eighty  percent  of  the  cost  at  this 
organization  [Kemerer  1987].  At  the  consulting  firm,  development  costs  are  tracked  through  a 
sophisticated project accounting and billing system, with the main entry being the bi-weekly
timesheets  of  the  professional  staff,  who  may  be  simultaneously  working  on  multiple  projects  for 
different  clients.  Other  direct  project  charges  are  also  administered  through  this  system,  especially 
travel.  Development  is  typically  done  at  the  client's  site,  and  therefore  hardware  chargeback  is 
often  unnecessary. 

Long  term  maintenance  costs  are,  in  part,  a  function  of  the  maintainability  of  a  system  [Banker  et 
al.  1991a].  While  there  are  many  factors  outside  the  control  of  both  the  principal  and  the  agent  that 
can  affect  maintenance  costs  (e.g.,  changes  in  external  business  conditions  such  as  regulatory 
changes),  the  principal  desires  that  the  agent  deliver  a  system  that  can  be  maintained  at  the  least 
possible  cost.  Therefore,  the  outcome  that  is  desired  is  a  high  level  of  maintainability. 
Unfortunately,  even  the  growing  recognition  of  the  significant  magnitude  of  maintenance  efforts 
has not yet produced a well-accepted metric for maintainability. The closest approximation to such
a notion is the class of software metrics known as complexity metrics [McCabe 1976] [Halstead
1977] [Henry and Kafura 1981] [Banker et al. 1991c]. The general notion is that, as systems
become more complex over time (so-called "system entropy" [Belady and Lehman 1976]), they
become more difficult to maintain. The various complexity metrics provide a means of measuring
this complexity, and therefore can be used both to predict maintenance costs and as an input to the
repair/rewrite decision [Gill and Kemerer 1991] [Banker et al. 1991a] [Banker et al. 1991b].

At  the  bank,  while  maintenance  projects  are  recognized  as  the  primary  information  systems 
development  activity,  no  attempt  is  currently  made  to  measure  and  control  for  the  maintainability  of 
the  applications.  Similarly,  at  the  consulting  firm  no  maintainability  measures  are  tracked,  even 
though  the  ongoing  maintenance  of  the  developed  system  by  the  firm  is  a  requirement  of  many 
projects. 

On the benefit row of Table 1, the short run benefit is provided by delivering the system on
schedule,  what  is  referred  to  as  system  timeliness.  Of  course,  the  appropriate  duration  of  a 
systems  development  project  is  very  much  dependent  upon  such  factors  as  the  size  of  the  system 
and  the  productivity  of  the  development  staff.  Therefore,  the  timeliness  metric  is  generally  stated 
in  relative  terms,  rather  than  absolute  terms,  most  typically  in  relation  to  a  deadline.  Thus,  a 
system  is  delivered  "on  time"  or  "2  months  late".  Of  course,  this  metric  is  really  a  difference 
result,  and  therefore  an  agent  seeking  to  minimize  the  difference  can  direct  effort  both  towards 
maximizing  the  time  period  (deadline)  allowed  during  the  project  planning  stage,  as  well  as 
towards  actually  developing  the  system  in  such  a  way  as  to  minimize  the  delay  from  the  delivery 
date.  However,  a  tendency  on  the  part  of  developers  to  estimate  or  propose  excessively  long 
development times will be mitigated by other controls, i.e., an external developer is unlikely to be
awarded  such  a  contract,  and  an  in-house  developer  may  find  that  the  principal  chooses  not  to  do 
the  system  at  all.  Therefore,  a  timeliness  metric  can  be  assumed  to  provide  at  least  partial 
motivation  to  develop  the  system  promptly. 
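
The difference nature of the timeliness metric can be illustrated with a minimal sketch. The function name and all dates below are hypothetical, chosen only for illustration; the point is that the same actual delivery date yields a different metric value depending on the deadline negotiated at the planning stage.

```python
from datetime import date


def schedule_slip_days(deadline: date, delivered: date) -> int:
    """Days late (positive) or early (negative) relative to the agreed deadline."""
    return (delivered - deadline).days


# Hypothetical project: one actual delivery date, two alternative negotiated deadlines.
delivered = date(1991, 9, 1)
tight_deadline = date(1991, 7, 1)
padded_deadline = date(1991, 9, 1)

# The development work is identical in both cases; only the planning-stage deadline differs.
slip_tight = schedule_slip_days(tight_deadline, delivered)
slip_padded = schedule_slip_days(padded_deadline, delivered)

print(f"Slip against tight deadline:  {slip_tight} days")
print(f"Slip against padded deadline: {slip_padded} days")
```

Under these assumed dates the project is reported as roughly two months late against the tight deadline and as on time against the padded one, which is why the other controls described above matter.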

At  the  bank,  project  schedules  are  published  and  the  larger  projects  are  tracked  via  a  regular  status 
meeting  chaired  by  the  most  senior  vice  president  in  charge  of  the  information  systems  function. 
Project  adherence  to  intermediate  milestones  is  checked,  and  late  projects  are  flagged  for 
discussion.  At  the  consulting  firm,  adherence  to  schedule  is  monitored  through  use  of  a 
development  methodology  with  standardized  milestones.  Deliverable  deadlines  are  an  important 
part  of  many  contracts,  with  clients'  desire  to  implement  systems  by  certain  fixed  dates  a  key 
contributor to their decision to use an external developer.

The  fourth  and  final  cell  in  Table  1  is  long  term  benefit,  or  effectiveness.  Effectiveness  metrics  are 
much  sought,  but  little  or  no  general  practitioner  agreement  has  been  reached  on  such  metrics. 
Crowston  and  Treacy  note  that:  "Implicit  in  most  of  what  we  do  in  MIS  is  the  belief  that 
information  technology  (IT),  has  an  impact  on  the  bottom  line  of  the  business.  Surprisingly,  we 
rarely know if this is true" [1986, p. 299]. They go on to review the existing literature in this area
for the previous ten years and conclude that, until more progress is made in identifying performance
variables, the best current metrics can only test whether systems engender user satisfaction or
usage. Therefore, commonly accepted effectiveness metrics tend to take the form of surveys of
user  satisfaction  [Bailey  and  Pearson  1983]  [Ives  et  al.  1983]  [Jenkins  and  Ricketts  1985].  This 
work  has  been  criticized  as  not  being  theoretically  based  [Chismar  et  al.  1986]  [Melone  1990]  and 
the  results  of  these  surveys  will  be  subject  to  the  users'  prior  expectations  about  the  system. 
Newer  work  proposes  'user  satisfactoriness'  as  a  theoretically-based  alternative  metric  surrogate 
for  system  effectiveness  [Goodhue  1986]. 

At the bank, no formal mechanisms are in place to measure user satisfaction, although occasional
efforts are made to interview key users about their needs. At the consulting firm, while user
satisfaction is deemed to be highly relevant in terms of its linkage to follow-on contracts, until very
recently no standardized mechanism existed to capture this information. Of course, contractual
provisions  typically  guarantee  some  minimum  level  of  performance.  Beyond  this,  a  small  number 
of  newer  projects  are  experimenting  with  a  user  satisfaction  survey. 

C.    Application  of  Model  Results 

The results of the previous sections are now combined by applying the measurement criteria from
the model to the commonly used operationalizations of IS performance evaluation metrics. From
this  application  some  observations  are  made  with  regard  to  the  model  criteria  about  the  relative 
emphasis  on  the  current  operationalizations  in  practice. 

C.1 Precision

The first criterion is the precision of the performance evaluation metrics. Precision's two principal
component parts, lack of controllability, as defined by $\mathrm{Var}(x_i \mid a)$, and lack of accuracy, as defined
by $\mathrm{Var}(e_i)$, are considered.

The relative controllability of the four performance evaluation criteria is fairly clear. The agent
has strong control over the development costs through his actions to manage them. Timeliness and
maintainability are less under the agent's control, as the relationships between efforts taken and the
results achieved are less well understood than those of costs [Abdel-Hamid and Madnick 1991].
Finally,  least  controllable  of  all  from  the  IS  development  agent's  perspective  is  the  system's 
effectiveness.  The  system  may  have  been  poorly  conceived  initially  by  the  requestor,  and 
therefore,  the  delivered  system,  while  perhaps  meeting  the  agreed  upon  technical  specifications, 
may  not  prove  to  be  valuable.  Or,  the  principal  may  do  an  inadequate  job  making  the 
organizational  changes  necessary  for  the  success  of  the  new  system,  e.g.,  re-assignment  of  tasks, 
re-training,  and  adjustment  of  compensation  systems.  In  support  of  these  notions  there  is  a 
growing  body  of  descriptive  research  that  suggests  that  many  completed  systems  are  never  used 
[Rothfeder 1988] [Kemerer and Sosa 1991]. Finally, if user satisfaction metrics are used, it may
be in the interests of the user not to report satisfaction as high, in order to extract additional effort
or attention from the developer.

In the accuracy component, the short-term measures clearly allow for more accuracy than the
long-term measures. Project management systems routinely track project expenditures and deadlines,
and these provide metrics that are relatively objective and accurate versus either maintainability
(subject  to  limitations  in  measurement  and  the  impact  of  the  unknown  nature  of  future  change 
requests)  or  effectiveness  (subject  to  the  lack  of  reliability  of  the  measurement  instrument  and  the 
unknown  impact  of  future  changes  in  the  business).  The  problems  with  maintainability  and 
effectiveness  introduce  a  large  degree  of  noise  into  the  process,  thus  reducing  the  precision  of 
metrics  for  those  performance  evaluation  variables. 

Summing  these  two  components  of  precision,  controllability  and  accuracy,  it  can  be  seen  that 
development  cost  scores  relatively  the  best  on  both  components,  while  effectiveness  scores 
relatively the worst. Timeliness and maintainability rate in the middle of these two extremes in
terms  of  their  precision. 
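
As a rough illustration of how the two components combine, the sketch below assumes that a reported metric equals the true outcome plus an independent measurement error, so that the two variances add, and it treats precision as the reciprocal of the total variance. Both the additive form and the numeric values are assumptions made for illustration, not results taken from the model; only the ordering of the values follows the qualitative discussion above.

```python
# Hypothetical variance components (arbitrary units). Larger numbers mean
# less controllability (var_outcome) or less accuracy (var_error).
components = {
    "development cost": {"var_outcome": 1.0, "var_error": 1.0},
    "timeliness": {"var_outcome": 2.0, "var_error": 1.0},
    "maintainability": {"var_outcome": 2.0, "var_error": 2.0},
    "effectiveness": {"var_outcome": 3.0, "var_error": 2.0},
}


def precision(var_outcome: float, var_error: float) -> float:
    """Reciprocal of total variance, assuming the two noise sources are independent."""
    return 1.0 / (var_outcome + var_error)


precisions = {name: precision(**c) for name, c in components.items()}
most_precise = max(precisions, key=precisions.get)
least_precise = min(precisions, key=precisions.get)

for name, value in sorted(precisions.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<18} precision = {value:.3f}")
```

With these assumed values, development cost comes out as the most precise metric and effectiveness as the least precise, with timeliness and maintainability in between.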

C.2  Sensitivity 

Development  cost,  operationalized  at  both  sites  primarily  as  labor  work  months,  is  a  sensitive 
metric, that is, it possesses a relatively high value for $\mu_i \, \partial m_i(a_i) / \partial a_i$. A project manager can
change the expected development cost by deciding which staff members are to be assigned* and
how  they  are  to  be  deployed,  and  by  providing  leadership  and  supervision  during  the  development 
process.  In  addition,  a  manager  may  also  influence  project  cost  by  under-reporting  his  own  hours, 
as  a  means  of  adding  value  to  a  project  without  exceeding  the  budget. 

The  other  short  term  measure,  timeliness,  as  operationalized  by  the  degree  to  which  the  deadline 
was  met,  is  relatively  less  sensitive  than  cost  to  the  agent's  increased  effort.  While  assigning  less 
or  more  expensive  personnel  directly  affects  the  project  cost,  the  influence  on  timeliness  is  less 
direct. Brooks's research has been summarized into the aphorism that "adding staff to a late
project  makes  it  later,"  denying  the  ability  of  the  agent  to  move  the  timeliness  metric  in  the  desired 
direction  in  a  substantial  way  [Brooks  1975].  In  essence,  given  the  current  state  of  practice  in 
software  project  management,  small  efforts  at  the  margin  on  the  part  of  the  agent  cannot  effect  as 
large  a  change  in  the  timeliness  metric  as  could  similar  efforts  directed  toward  the  development  cost 
metric.  This  result  depends  in  part  upon  the  project  specification  being  sufficiently  concrete  as  to 
disallow the possibility of significant "gaming", i.e., undocumented reductions in scope that allow
the appearance of on-time delivery of what in reality is significantly reduced functionality. This is
the situation at both of the case study sites, particularly the external consulting firm where formal
contracts are the norm.


*This assumes that different levels of staff are charged at different rates, a typical approach, but one that is not
universal, particularly among internal IS development groups.

In terms of the longer term metrics, the cost side is reflected by maintainability, possibly
operationalized by complexity metrics (although not done at either site), and the benefit side is
referred to as effectiveness, possibly operationalized by user satisfaction (although not done at
either site). Maintenance, despite its growing economic importance, is a relatively unstudied and
therefore  poorly  understood  phenomenon.  Metrics  for  measuring  maintainability  are  in  their 
infancy,  and  therefore  the  relationships  among  agents'  efforts  and  their  impact  on  maintainability  or 
the  complexity  metrics  are  even  less  well  understood.  The  project  manager's  ability  to  influence 
maintainability  is,  therefore,  limited.  Thus,  the  sensitivity  of  maintenance  metrics  can  only  be 
described  as  low. 

Conversely,  the  user  satisfaction  metrics  used  to  indicate  effectiveness  should  show  relatively  high 
sensitivity.  Often,  the  inclusion  of  a  seemingly  small  feature  can  greatly  improve  the  user's 
perceived  or  even  actual  value  for  the  application.  If  the  IS  development  agent  is  aware  of  user 
needs  and  preferences,  particularly  regarding  user  interface  issues,  he  is  often  able  to  greatly 
influence  user  satisfaction. 

In summary, the relative sensitivities of the commonly used metrics are as follows. Development
cost  and  user  satisfaction  will  exhibit  relatively  high  sensitivity,  with  timeliness  being  less  sensitive 
than  either  of  them.  Finally,  maintainability,  with  the  current  poor  understanding  of  the 
relationship  between  complexity  and  maintenance,  is  relatively  the  least  sensitive  of  the  four. 

C.3 Summary

In  examining  all  of  the  performance  evaluation  metrics  relative  to  the  criteria  defined  by  the  model,  it 
can  be  seen  that  development  cost  as  a  metric  ranks  high  in  terms  of  both  its  sensitivity  and 
precision.  Timeliness  as  measured  by  meeting  deadlines  is  also  sensitive,  but  somewhat  less 
precise.  Effectiveness  seems  sensitive,  but  fares  poorly  in  terms  of  its  precision,  while 
maintainability  is  only  moderately  sensitive  and  moderately  precise.  Therefore,  using  the  result 
from  Equation  12,  an  ordinal  ranking  of  performance  evaluation  metrics  would  list  development 
cost  and  deadlines  a  close  first  and  second,  followed  by  user  satisfaction  and  then  followed  by 
maintainability,  as  shown  in  Table  2. 

                                       Precision
Metric                   Sensitivity   Control   Accuracy   Ordinal Score

Budget performance       High          High      High       1a
Schedule performance     High          Medium    High       1b
User satisfaction        High          Low       Medium     2
Maintenance complexity   Low           Medium    Medium     3

Table 2: Relative Metric Values
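
The ordinal scores in Table 2 can be reproduced with a simple sketch. The numeric coding of the ratings (Low = 1, Medium = 2, High = 3) and the scoring rule (sensitivity multiplied by the sum of the two precision ratings) are assumptions chosen for illustration; they are not the weighting derived in the model, and other monotone scoring rules would serve equally well.

```python
# Hypothetical numeric coding of the qualitative ratings in Table 2.
LEVEL = {"Low": 1, "Medium": 2, "High": 3}

# Ratings as (sensitivity, control, accuracy), transcribed from Table 2.
table_2 = {
    "Budget performance": ("High", "High", "High"),
    "Schedule performance": ("High", "Medium", "High"),
    "User satisfaction": ("High", "Low", "Medium"),
    "Maintenance complexity": ("Low", "Medium", "Medium"),
}


def score(ratings):
    """Illustrative score: sensitivity times the sum of the two precision ratings."""
    sensitivity, control, accuracy = (LEVEL[r] for r in ratings)
    return sensitivity * (control + accuracy)


# Sort metrics from highest to lowest illustrative score.
ranking = sorted(table_2, key=lambda metric: score(table_2[metric]), reverse=True)

for position, metric in enumerate(ranking, start=1):
    print(position, metric, score(table_2[metric]))
```

Under this assumed coding the order is budget performance, schedule performance, user satisfaction, and maintenance complexity, matching the ordinal scores in the table.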

To  summarize,  this  ranking  is  based  on  observations  at  the  two  sites.  In  practice,  both  sites 
emphasize  measurement  on  two  dimensions,  cost  and  timeliness,  that  are  seen  to  possess  relatively 
the most precision and sensitivity, as predicted by the model. How these dimensions might fare at
other sites, or at these same sites in the future, is discussed in Section IV.

IV.    DISCUSSION 

In this section the generalizability and implications of the results shown in Section III are
discussed.  Limitations  of  the  model  and  possible  extensions  to  it  are  also  presented. 

A.  Generalizability  and  Implications  of  the  Results 

In examining the results presented in Section III, one possible concern might be with the
representativeness  of  the  two  mini-case  studies.  Even  though  their  practices  are  in  line  with  what 
the  model  would  predict,  to  what  degree  are  they  believed  to  be  representative  of  current  practice? 

Three  other  sources  of  data  on  the  current  state  of  measurement  suggest  that  the  two  mini-cases  are 
quite  typical  of  current  practice.  In  a  survey  of  over  140  medium  to  large  IS  departments 
conducted  in  1988,  managers  were  asked  what  measures  they  currently  used  [Howard  1988].  By 
far  the  leading  measures  were  work-hours  per  project,  a  measure  of  development  cost  (78%  of 
managers  surveyed),  and  adherence  to  delivery  dates,  a  measure  of  timeliness  (72%).  The  third 
most  used  measure  was  computer  resource  usage,  which  was  only  mentioned  by  27%  of  the 
respondents. All other measures were less frequently reported, and, in particular, "module size", a
potential measure of maintainability, was reported by only 8% of the respondents.


Another  independent  source  is  some  descriptive  data  from  the  forthcoming  text  by  Jones  [1991]. 
His  reports  about  the  status  of  software  measurement  in  various  industries  are  worth  quoting  in 
full:

"Companies such as Exxon and Amoco were early students of software productivity
measurement, and have been moving into ... user satisfaction as well... The leading
insurance companies such as Hartford Insurance, UNUM, USF&G, John Hancock, and
Sun Life Insurance tend to measure productivity, and are now stepping up to ... user
satisfaction measures as well... In the manufacturing, energy, and wholesale/retail segments
the use of software productivity measurement appears to be proportional to the size of the
enterprise: the larger companies with more than a thousand software professionals such as
Sears Roebuck and J.C. Penney measure productivity, but the smaller ones do not. ... user
satisfaction measurement are just beginning to heat up within these industry
segments... Companies such as Consolidated Edison, Florida Power and Light, and
Cincinnati Gas and Electric are becoming fairly advanced in software productivity measure.
Here too, ... user satisfaction measures have tended to lag behind." [1991, pp. 22-24]

Finally,  Watts  Humphrey,  in  his  work  on  software  process  maturity,  notes  that  the  first  measures 
adopted  by  organizations  are  cost  and  schedule  metrics.  It  is  not  until  stage  four  of  the  five  stage 
model  that  more  comprehensive  measures  are  expected  to  be  implemented  [Humphrey  1988,  p. 
74]. 

These  independent  observations  document  what  is  predicted  by  the  model.  Productivity  measures, 
be they cost or time-related, are in wide use, while effectiveness measures, in the form of user
satisfaction or quality metrics, are less widely adopted. Measures of maintainability are completely
absent  from  these  discussions,  which  is  consistent  with  the  results  in  Table  2  which  suggest  that 
they  are  the  least  likely  of  the  four  to  be  adopted. 

The  implications  for  this  choice  of  adoption  are  worthy  of  managerial  concern.  The  emphasis  on 
short-term  results  will  tend  to  produce  decisions  on  project  planning,  staffing,  and  technology 
adoption  that  are  sub-optimal  for  the  organization  in  the  long-term.  For  example,  the  total  lack  of 
measurement  of  the  maintainability  impacts  of  project  decisions  implies  that  only  minimal  effort 
will  be  devoted  towards,  for  example,  useful  design  and  code  documentation  or  adherence  to 
structured  coding  precepts,  to  the  extent  that  these  activities  are  viewed  as  costly  or  otherwise 
compete  for  resources  with  different  activities  that  are  measured.  Similarly,  an  emphasis  on 
schedule  and  budget  measurements  in  preference  to  effectiveness  measures  implies  an  emphasis  on 
delivering  any  product  on-time,  rather  than  a  better  product  later,  where  this  latter  option  might  be 
the  preferred  alternative  for  the  organization. 

B.  Limitations  and  Possible  Extensions  to  the  Results 

The  discussion  so  far  has  been  limited  to  application  of  the  results  of  the  model  to  examples  in 
traditional information systems development that are currently observed. Two obvious extensions
to this analysis would be (a) to apply the model to different development environments, and (b) to
speculate as to future trends that may have some impact on these results.

Different environments may have available metric operationalizations that exhibit higher precision
or  sensitivity  or  both  versus  their  counterparts  in  traditional  information  systems.  For  example,  the 
effectiveness  dimension  is  traditionally  perceived  as  difficult  to  quantify.  However,  in  another 
environment this may not be the case. For example, in a safety critical application, such as real-time
control of a nuclear power plant, software reliability may be the overwhelming criterion, and
therefore the degree to which the software has been tested and can be 'proven' correct may swamp
all  other  possible  effectiveness  considerations.  To  the  degree  that  metrics  for  reliability  exhibit 
higher  control  and  accuracy  relative  to  the  equivalent  user  satisfaction  metric  of  traditional 
information  systems,  and  to  the  degree  that  reliability  is  a  highly  valued  outcome  dimension,  it  will 
be weighted more heavily. Another example might be the effectiveness of a real-time military fire
control system, which may depend almost solely on its operational performance (speed). This
may lend itself to easily definable metrics that possess desirable properties.

One  change  that  may  occur  over  time  within  the  information  systems  environment  is  greater 
recognition  of  the  ability  to  measure  and  improve  software  maintainability.  While  the  importance 
of  the  maintenance  activity  has  been  recognized  for  over  a  decade  [Lientz  and  Swanson  1980],  it  is 
only  recently  that  research  has  linked  measures  of  complexity  to  maintainability  [Gibson  and  Senn 
1989] [Gill and Kemerer 1991] [Banker et al. 1991a] [Banker et al. 1991b]. This realization has
been accompanied by the commercial availability of automated tools that deliver the metric values
[Babcock 1986] [McAuliffe 1988]. To the degree that these static analysis tools are delivered
within CASE environments, rather than having to be justified and purchased as stand-alone tools,
their  use  can  be  expected  to  increase.  Therefore,  over  time  a  greater  understanding  of  software 
complexity  metrics  as  operationalizations  of  the  maintainability  dimension  may  improve  the  relative 
sensitivity  of  this  metric. 

A further interpretation of the results from the model would be to move beyond the positive or
descriptive aspects and interpret the results in a normative manner, i.e., deficiencies in current
metrics indicate where further effort is needed. This would argue for greater emphasis on
metrics  for  both  effectiveness  and  maintainability,  as  these  are  the  two  dimensions  least  well 
represented  by  current  metrics.  It  should  be  noted  that  this  result  which  is  derived  from  the  agency 
theory  perspective  matches  well  with  some  current  calls  from  practitioners  for  better  measures  of 
the  'business  value'  of  IS  development  [Banker  and  Kauffman  1988].  For  example,  the 
effectiveness  dimension  would  be  likely  to  be  measured  more  often  if  there  were  a  metric  that 
possessed  better  characteristics  than  the  current  user  satisfaction  metric. 

A more general result that follows from this same line of reasoning is that, given the poor qualities
of the currently available outcome metrics, it is not surprising to see an emphasis in practice on
behavior-based metrics (e.g., tracking of hours-not-to-exceed-n; timebox management). This has
a number of implications, particularly for the viability of outsourcing. Given the impossibility of
directly monitoring outsourced development, for contracting purposes the principal would be
greatly aided by the availability of outcome metrics with desirable properties such as sensitivity and
precision.  The  current  deficiency  in  this  area  may  slow  the  growth  of  adoption  of  systems 
development  outsourcing. 

V.  CONCLUDING  REMARKS 

This  paper  has  developed  a  principal-agent  model,  grounded  in  information  economics  theory,  that 
provides  a  common  conceptual  framework  to  illuminate  current  and  future  practice  with  regard  to 
performance  evaluation  metrics  for  information  system  development.  Given  the  principal-agent 
nature  of  most  significant  scale  IS  development,  insights  that  will  allow  for  greater  alignment  of  the 
agent's  goals  with  those  of  the  principal  through  incentive  contracts  will  serve  to  make  IS 
development  both  more  efficient  and  more  effective.  An  important  first  step  in  this  process  is 
gaining  a  better  understanding  of  the  behavior  of  the  metrics  used  in  contracting  for  IS 
development. The insights available from the model both suggest explanations as to the current
weighting  of  the  dimensions  of  IS  development  performance,  and  provide  insights  into  where 
better  metrics  are  needed  if  the  current  largely  unsatisfactory  situation  is  to  be  remedied. 

In terms of future research, a natural follow-on would be to perform an empirical validation of the
proposed  relative  weightings  given  a  set  of  performance  evaluation  metrics.  This  will  require  the 
development  of  an  instrument  to  measure  the  model's  sensitivity  and  precision  constructs. 

The  ultimate  value  of  such  research  will  be  in  an  increased  understanding  of  how  best  to  evaluate 
current  systems  development  performance,  so  as  to  provide  guidance  to  managers  on  how  best  to 
improve  that  performance.  Given  the  key  role  played  by  systems  development  in  enabling  strategic 
uses  of  information  technology,  such  improvement  is  of  critical  importance  to  the  management  of 
organizations. 